A Three-Layered Collocation Extraction Tool and Its Application in China English Studies

نویسندگان

  • Jingxiang Cao
  • Dan Li
  • Degen Huang
چکیده

We design a three-layered collocation extraction tool by integrating syntactic and semantic knowledge and apply it in China English studies. The tool first extracts peripheral collocations in the frequency layer from dependency triples, then extracts semi-peripheral collocations in the syntactic layer by association measures, and last extracts core collocations in the semantic layer with a similar word thesaurus. The syntactic constraints filter out much noise from surface co-occurrences, and the semantic constraints are effective in identifying the very “core” collocations. The tool is applied to automatically extract collocations from a large corpus of China English we compile to explore how China English as a variety of English is nativilized. Then we analyze similarity and difference of the typical China English collocations of a group of verbs. The tool and results can be applied in the compilation of language resources for Chinese-English translation and corpus-based China studies.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Comparative Analysis of Collocation in Arabic-English Translations of the Glorious Quran

The Qur’an is the only holy book of Muslims all around the world. Each person with any religion and language is interested in comprehending and accepting the rules and regulations of their own belief. Translation of the Qur’an is only an attempt to present its meaning. One of the most challenges in translation of the Qur’an is collocation. A collocation is a sequence of words or terms that co-o...

متن کامل

Automatic Extraction of English Collocations and their Chinese - English Bilingual Examples : A Computational Tool for Bilingual Lexicography

This paper describes the procedures involved in developing EXEC, a web-based system which can automatically extract English collocations and their Chinese-English bilingual examples from parallel corpora. The system draws on statistics, dependency parsing, and Chinese-English parallel corpora of more than 13 million English words and 27 million Chinese characters. By taking a word as well as th...

متن کامل

A Tool for Multi-Word CoUocation Extraction and Visualization in MultUingual Corpora

This document describes an implemented system of collocation extraction which is designed as aid to translation and which will be used in a real translation environment. Its main functionalities are: retrieving multi-word collocations from an existing corpus of documents in a given language (only French and English are supported for the time being); visualizing the list of extracted terms and t...

متن کامل

The Comparison of Native English and Persian Elementary School Students’ Performance on Lexical and Grammatical Collocations

The importance and howness of language learning/ acquisition has been a great concern for decades. There are many factors that play important roles in this regard. This research compared the performance of native Persian and English elementary students to see if there is any significant difference between the two groups and which type of collocation they performed better within the groups. For ...

متن کامل

Propagation of a Monopulse in Layered and Inhomogeneous Media Using an Adaptive Multiscale Wavelet Collocation Method

Figure 2: Electric eld in layered and inhomogenous(linear proole) meida. The evolution of the transmitted pulse in hte inhomogenous layer can be seen.

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015